Text File | 1986-11-15 | 16KB | 450 lines

STAT-SAK
The Statistician's Swiss Army Knife
Version 2.1
(c) 1985, 1986

"One of many STATOOLS(tm)..."

by

Gerard E. Dallal
53 Beltran Street
Malden, MA 02148


STAT-SAK is the STATistician's Swiss Army Knife: While not the
custom tool for any particular job, it carries out a wide variety
of miscellaneous tasks that are not easily performed by most
large statistical packages.


NOTICE

Documentation and original code copyright 1986 by Gerard E.
Dallal. Reproduction of material for non-commercial purposes is
permitted, without charge, provided that suitable reference is
made to STAT-SAK and its author. Neither STAT-SAK nor its
documentation should be modified in any way without permission
from the author, except for those changes that are essential to
move STAT-SAK to another computer.


DISCLAIMER

STATOOLS(tm) are provided "as is" without warranty of any kind.
The entire risk as to the quality, performance, and fitness for
intended purpose is with you. You assume responsibility for the
selection of the program and for the use of results obtained from
that program.


INSTALLATION

STAT-SAK was written for the IBM-PC but few changes should be
needed to install it on another computer. The first DATA
statement initializes the variables

     IIN  -- input unit number (screen)
     IOUT -- output unit number (screen)


DESCRIPTION

STAT-SAK is a tool for anyone who regularly analyzes data. It was
written under the assumption that users would have access to a
"comprehensive" micro or mainframe statistical package such as
SAS, SPSS-X, BMDP, or SYSTAT. STAT-SAK does not perform
calculations that require access to the original observations.

STAT-SAK performs the following calculations and analyses:

 1. Distributions:

    a. Normal distribution
          quantile to probability:
             upper-tail
             lower-tail
             two-tailed
          probability to quantile

       [Whenever a quantile of zero is specified, STAT-SAK
       prompts for three quantities A,B,C and then evaluates
       A/(B/SQRT(C)).]

    b.
       t distribution:
          quantile to probability
          probability to quantile

    c. chi-square distribution:
          quantile to probability
          probability to quantile

    d. F distribution:
          quantile to probability
          probability to quantile

    e. Binomial: Prob (binomial(n,p) <=,=,>= k)

    f. Poisson: Prob (Poisson(mean) <=,=,>= quantile)

 2. Tests of independence/homogeneity of proportions in two
    dimensional contingency tables: Pearson chi-square statistic,
    with Yates's continuity correction in the case of a 2 by 2
    table.

 3. Fisher's exact test for 2 by 2 contingency tables.

 4. Mantel-Haenszel test, approximate 95% confidence interval for
    common odds ratio using equation 10.22 of Fleiss (1981).

 5. McNemar's test: An exact test using the binomial(n,0.5)
    distribution.

 6. Correlation coefficients:

    a. Test that a population correlation coefficient is zero.

    b. Construct confidence interval for a single correlation
       coefficient using a FORTRAN translation of the BASIC
       program of Maindonald (1984, p.300) based on an
       approximation from Winterbottom (1980).

    c. Compare two independent correlation coefficients using
       Fisher's z transformation.

 7. Bartholomew's test for increasing proportions: Exact P-values
    are given for 3 or 4 proportions. For 5 or more proportions,
    STAT-SAK gives the P-value appropriate for equal column
    totals. This value is NOT conservative for all tables. See
    Bartholomew (1959a,b).

    Since Bartholomew's test is a pool-adjacent-violators
    procedure, it can yield "significant" results even when the
    data show no real trend. In the table

              85   15   85
              15   85   15

    the first row proportion decreases from column 1 to column 2;
    Bartholomew's test fits the proportion 100 (=85 + 15) / 200
    (=100 + 100) to these columns. The first row proportion then
    increases significantly as we move to column 3 and the test
    statistic is significant overall.
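The pooling step in this example can be reproduced with a short sketch. It is written in Python for illustration and is not taken from STAT-SAK's FORTRAN source; the function names are hypothetical. The statistic computed at the end is an assumption: it uses the usual textbook form of the ordered chi-square statistic, the sum over columns of n(j) * (fitted p(j) - overall p)**2 divided by (overall p) * (1 - overall p), which may differ in detail from what STAT-SAK evaluates.

```python
def pool_adjacent_violators(successes, totals):
    """Fit nondecreasing column proportions by pooling adjacent violators."""
    blocks = []  # each block holds [successes, total, columns pooled]
    for s, n in zip(successes, totals):
        blocks.append([s, n, 1])
        # Pool while the previous block's proportion exceeds the last one's.
        # Cross-multiplying keeps the comparison in exact integers.
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]):
            s2, n2, c2 = blocks.pop()
            blocks[-1][0] += s2
            blocks[-1][1] += n2
            blocks[-1][2] += c2
    fitted = []
    for s, n, c in blocks:
        fitted.extend([s / n] * c)
    return fitted


def ordered_chi_square(successes, totals):
    """Assumed textbook form of the ordered statistic (see the lead-in)."""
    fitted = pool_adjacent_violators(successes, totals)
    p = sum(successes) / sum(totals)
    numerator = sum(n * (f - p) ** 2 for f, n in zip(fitted, totals))
    return numerator / (p * (1.0 - p))


# First row of the table above, with column totals of 100 each.
row1 = [85, 15, 85]
column_totals = [100, 100, 100]
print(pool_adjacent_violators(row1, column_totals))  # [0.5, 0.5, 0.85]
print(round(ordered_chi_square(row1, column_totals), 2))
```

Columns 1 and 2 are pooled to 100/200, so the fitted proportions rise only at column 3, even though the observed proportions fall sharply between columns 1 and 2.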

    In order to alert users to such potential pitfalls, STAT-SAK
    reports the Pearson goodness-of-fit statistic for homogeneity
    of proportions along with the difference between the Pearson
    statistic and the Bartholomew statistic. Although the
    reference distribution of the difference is not known, large
    values are indicative of model failure. A conservative test
    can be had using the central chi-square distribution with
    degrees of freedom equal to the number of columns minus 1.

 8. Bartholomew's test for increasing normal means: Exact
    P-values are given for 3 or 4 means. For 5 or more, STAT-SAK
    gives the P-value based on equal column totals. This value is
    NOT conservative in all cases.

    The test for increasing means poses the same problem as does
    the test for proportions: the data can be significant
    relative to the null hypothesis of no change even if the
    monotonicity is seriously violated. Lack-of-fit is assessed
    by partitioning sums of squares as in a standard analysis of
    variance. Exact tests for lack of fit are unavailable,
    however, since the F-ratios constructed in this manner do not
    follow central F distributions under their respective null
    hypotheses. We must be content with bounds on the P-value.

    Let BSS and WMS be the between group sum of squares and
    within group mean square calculated when performing a
    standard one-way analysis of variance. Let OBSS be the
    between group sum of squares calculated under order
    restriction. Let 'k' be the number of samples and N be the
    total number of observations. Since

         (BSS-OBSS)/(k-1)         BSS/(k-1)
         ----------------   <=   -----------
               WMS                   WMS

               [1]                   [2]

    it follows that an upper bound to the P-value can be had by
    comparing [1] to the percentiles of the F distribution with
    k-1 numerator degrees of freedom and N-k denominator degrees
    of freedom.
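The upper bound can be sketched numerically as follows. This is an illustration in Python, not STAT-SAK's FORTRAN code: the function names and the input values (BSS = 120, OBSS = 90, WMS = 10, k = 3, N = 13) are hypothetical, and the F tail area is computed from the regularized incomplete beta function by a standard continued-fraction method rather than by the published algorithms STAT-SAK uses.

```python
from math import exp, lgamma, log


def _beta_cf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even step of the recurrence.
        aa = m * (b - m) * x / ((qam + m2) * (a + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        h *= d * c
        # Odd step of the recurrence.
        aa = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        d = 1.0 + aa * d
        d = 1.0 / (d if abs(d) > tiny else tiny)
        c = 1.0 + aa / c
        c = c if abs(c) > tiny else tiny
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def regularized_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (lgamma(a + b) - lgamma(a) - lgamma(b)
                 + a * log(x) + b * log(1.0 - x))
    front = exp(log_front)
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_upper_tail(f, df1, df2):
    """P(F > f) for an F distribution with df1 and df2 degrees of freedom."""
    if f <= 0.0:
        return 1.0
    return regularized_beta(df2 / 2.0, df1 / 2.0, df2 / (df2 + df1 * f))


def lack_of_fit_upper_bound(bss, obss, wms, k, n_total):
    """Return ratio [1], ratio [2], and the upper bound on the P-value."""
    ratio1 = (bss - obss) / (k - 1) / wms
    ratio2 = bss / (k - 1) / wms
    return ratio1, ratio2, f_upper_tail(ratio1, k - 1, n_total - k)


# Hypothetical summary values, for illustration only.
ratio1, ratio2, p_upper = lack_of_fit_upper_bound(120.0, 90.0, 10.0, 3, 13)
print(ratio1, ratio2, p_upper)
```

With these hypothetical inputs, ratio [1] is 1.5 and ratio [2] is 6.0; the lack-of-fit P-value is no larger than the tail area returned as p_upper.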

    A lower bound to the P-value is obtained by assigning the
    difference BSS-OBSS to a single degree of freedom, but it
    must be noted that in the case of two groups with equal
    means, BSS-OBSS equals BSS with probability 1/2 and 0 with
    probability 1/2. Hence, for k populations not all of whose
    means are monotonically nondecreasing, the probability that
    (BSS-OBSS)/WMS exceeds some particular value is no less than
    HALF that given by the F distribution with 1 numerator degree
    of freedom and N-k denominator degrees of freedom.

    The summary statistics may be read from an external file, one
    record per sample, each record containing a sample's mean, SD
    or SE, and count separated by one or more spaces. (FORTRAN's
    list directed input is used to read the values.)

 9. One sample t test from summary statistics.

10. Two sample t test from summary statistics: includes pooled
    standard deviation, F-ratio for testing equality of
    variances, and t tests based on both equal and unequal (using
    Satterthwaite's approximation) variances. (By entering dummy
    sample means, this routine can be used to obtain pooled
    standard deviations from individual standard deviations or
    standard errors.)

STAT-SAK is a dynamic program. If there are any capabilities you
would like to have added, drop me a note. I'll consider your
suggestions for future versions. The criteria for inclusion are:

 1. unavailable in standard statistical packages or not easily
    obtained (meaning it's almost easier to do by hand)

 2. does not require the original observations.


FOR THE FREQUENT USER

At the prompts

     "Enter 'Q' to quit, press <Enter> to continue"

and

     "Enter 'R' to return to main menu, press <Enter> to continue",

any valid STAT-SAK command may be entered, thereby bypassing the
display of the menu.


ALGORITHMS

STAT-SAK makes use of the following published routines:

Best, D.J. and D.E. Roberts (1975). Algorithm AS 91. The
     percentage points of the chi-squared distribution.
     Appl. Statist., 24, 385-388.

Bhattacharjee, G.P. (1970). Algorithm AS 32. The incomplete gamma
     integral. Appl. Statist., 19, 285-287.

Cran, G.W., K.J. Martin and G.E. Thomas (1977). Remark AS R19 and
     Algorithm AS 109. A remark on algorithms AS 63: The
     incomplete beta integral, and AS 64: Inverse of the
     incomplete beta function ratio. Appl. Statist., 26, 111-114.

Hill, I.D. (1973). Algorithm AS 66. The normal integral.
     Appl. Statist., 22, 424-427.

Majumder, K.L. and G.P. Bhattacharjee (1973). Algorithm AS 63.
     The incomplete beta integral. Appl. Statist., 22, 409-411.

Odeh, R.E. and J.O. Evans (1974). Algorithm AS 70. The percentage
     points of the normal distribution. Appl. Statist., 23, 96-97.

and the author's FORTRAN translation of

Pike, M.C. and I.D. Hill (1966). Algorithm 291. Logarithm of the
     gamma function. Commun. Ass. Comput. Mach., 9, 684.


REFERENCES

Bartholomew, D.J. (1959a). A test of homogeneity for ordered
     alternatives. Biometrika, 46, 36-48.

---------------- (1959b). A test of homogeneity for ordered
     alternatives. II. Biometrika, 46, 328-335.

Fleiss, Joseph L. (1981). Statistical Methods for Rates and
     Proportions, 2nd ed. New York: John Wiley & Sons, Inc.

Maindonald, J.H. (1984). Statistical Computation. New York: John
     Wiley & Sons, Inc.

Winterbottom, Alan (1980). Estimation of the bivariate normal
     correlation coefficient using asymptotic expansions. Comm.
     in Statist. Simulation and Computation, B9, 599-609.


STATOOLS(tm)

STAT-SAK is one of many STATOOLS(tm), a set of stand-alone
programs designed to fill in some of the gaps left by major
statistical program packages such as SAS, SPSS-X, BMDP, and
SYSTAT:

PC-PITMAN calculates observed significance levels using recursive
     relationships to obtain the randomization (permutation)
     distribution of a number of statistics without directly
     examining all possible permutations of the data.
     It performs one and two sample randomization tests and rank
     tests in the presence of an arbitrary number of ties in the
     data.

PC-SIZE determines the sample size requirements for single factor
     experiments, two factor experiments, randomized blocks
     designs, paired t-tests, and comparison of proportions. It
     can calculate the power of specific sample sizes as well as
     determine the sample size needed to achieve specific power.

PC-MULTI performs multiple comparisons using Tukey's honest
     significant differences (studentized range statistic).

FORGET-IT produces Forget-it Plots (also known as Two-way Plots),
     a graphical device for displaying the interaction structure
     of a two-way table.

PC-EMS produces tables of expected mean squares for balanced
     experiments using the Cornfield-Tukey algorithm.

PC-PLAN generates randomization plans.

STRUCTR fits structural relations when the ratio of error
     variances is known.

PC-AIP fits additive-in-the-probits models to two-dimensional
     contingency tables with ordered column classifications.

To obtain a set on three diskettes (for the IBM PC and
compatibles running DOS 2.0 or later versions) containing the
source code, executable files, and user's guides, send a check in
the amount of $20 to

     Gerard E. Dallal
     53 Beltran Street
     Malden, MA 02148